1.  Introduction


The  National  Research  Council  (NRC)  lists  stratospheric  ozone  and  the  species  that  control  its 
catalytic  destruction  as  a  key  research  challenge  facing  the  atmospheric  chemistry  community  in 
the  21st  century  (NRC,  1998).  The  physical  properties  of  aerosols  and  clouds  affect 
stratospheric  chemistry  in  general  and  stratospheric  ozone  in  particular.  Because  chemical 
reactions  that  take  place  on  the  surfaces  of  aerosols  and  polar  stratospheric  clouds  (PSCs)  have 
been  connected  to  the  observed  springtime  ozone  depletion  in  both  the  Arctic  and  Antarctic,  it  is 
important to know PSC phase, size, and chemical composition. Laboratory studies have shown
that the heterogeneous reactions that occur on the different types of PSCs are all effective for
activating chlorine but with different efficiencies (Carslaw et al., 1997). The ability to accurately
classify PSCs according to their physical characteristics will assist in studying their
reactivity. In addition, an automated classification method, as opposed to the more
subjective approaches used in the past, can make the analysis of large data sets more efficient,
particularly those from space-borne LIDARs (light detection and ranging systems).

LIDARs have been used to study PSCs in the Arctic and Antarctic (McCormick et al., 1981;
Poole & McCormick, 1988; Kent et al., 1990; Browell et al., 1990; Toon et al., 1990; Tabazadeh
& Toon, 1996; Tsias et al., 1999; Toon et al., 2000).

These  studies  have  yielded  many  PSC  particle  classifications  that  were  (at  least  initially)  based 
on  apparent  clustering  of  LIDAR  observables:  scattering  ratio,  aerosol  depolarization,  and  color 
ratio.  The  boundaries  between  the  clusters  were  defined  somewhat  subjectively  by  scientists 
analyzing  the  data  and  were  subsequently  corroborated  by  deductions  of  microphysical 
distinctions  (particle  size,  shape,  and  composition)  suggested  by  the  LIDAR  observables  and  the 
temperatures  at  which  the  PSCs  were  observed.  The  most  common  of  these  classifications  is 
presented in table 1.


Table 1.  Summary of PSC particle classifications.

PSC Classification   Description
Ia                   Large nitric acid trihydrate crystals
Ia-enhanced          NAT crystals downwind from mountain wave-induced ice clouds
Ib                   Liquid ternary solution particles
Ic                   Ia and Ib mix
II                   Ice crystals

In this study, the extensive set of PSC measurements acquired during the stratospheric
aerosol and gas experiment (SAGE) III ozone loss and validation experiment (SOLVE) was
used as the basis to classify PSCs. Well-known statistical techniques were employed in an
attempt to identify the number of PSC types and their corresponding characteristics more
objectively.

This technique contrasts with many earlier methods, which took a more subjective approach
that relied on the researchers’ ability to discriminate between particle types based on their
knowledge of the parameters used in their analysis. The method presented here is potentially
more efficient for classifying and analyzing the PSCs within the extensive data sets that are
produced by space-based LIDAR missions such as the cloud-aerosol LIDAR and infrared
pathfinder satellite observations (CALIPSO).


2.  SOLVE  Aerosol  LIDAR  Data 


SOLVE  was  a  measurement  campaign  focused  on  the  processes  that  control  ozone 
concentrations  at  middle  and  high  latitudes.  The  mission  included  the  deployment  of  several 
aircraft-based  and  balloon-borne  instruments  and  was  based  in  Kiruna,  Sweden,  during  the 
winter of 1999-2000. This study employed data acquired by the LaRC aerosol LIDAR, which
was deployed on the National Aeronautics and Space Administration (NASA) DC-8 aircraft during
the SOLVE mission. The LaRC aerosol LIDAR is a piggyback instrument on the NASA
Goddard Space Flight Center (GSFC) airborne Raman ozone, temperature, and aerosol LIDAR
(AROTAL). The LIDAR data used in this study were limited to the following: total scattering
ratio at 532 nm (R532), total scattering ratio at 1064 nm (R1064), aerosol backscatter coefficient at
532 nm (β532), aerosol backscatter coefficient at 1064 nm (β1064), and aerosol depolarization ratio at
532 nm (δ532). Each measurement in the data set pertains to an observed ensemble of particles.

The scattering ratio provides information about the backscatter from aerosols relative to the
backscatter from molecules, and the aerosol backscatter coefficient is a function of the
concentration and size of the aerosols. Therefore, for an ensemble, a high scattering ratio implies
that there is more backscatter attributable to aerosols than to molecules, and a high aerosol
backscatter coefficient implies a high concentration of aerosols or the presence of large aerosols.
Also, information about the shapes of the particles in the ensemble can be obtained from the
aerosol depolarization ratio. A high aerosol depolarization ratio indicates that non-spherical
particles, which are associated with solid particles, are part of the ensemble being measured,
whereas ensembles of liquid particles produce low aerosol depolarization ratios. The
color ratio used in this analysis is defined as β532/β1064 and provides information about the sizes
of the particles in the ensemble through the wavelength dependence of particle scattering.
Therefore, a high color ratio indicates the presence of small particles within the ensemble and a
low color ratio indicates the presence of large particles within the ensemble. The vertical and
horizontal resolutions of the data products used in this study were 75 m and 2.5 km, respectively.
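
The relationships among these observables can be sketched in a few lines of Python. The report does not give explicit formulas for the scattering ratio, so this sketch assumes the common convention that the total scattering ratio is total backscatter divided by molecular backscatter; the function and argument names are illustrative only.

```python
import numpy as np

def scattering_ratio(beta_aerosol, beta_molecular):
    """Total scattering ratio, assumed here to be
    (aerosol + molecular backscatter) / molecular backscatter."""
    beta_aerosol = np.asarray(beta_aerosol, dtype=float)
    beta_molecular = np.asarray(beta_molecular, dtype=float)
    return 1.0 + beta_aerosol / beta_molecular

def color_ratio(beta_532, beta_1064):
    """Color ratio as defined in the text: beta_532 / beta_1064.
    Larger values point to smaller particles in the ensemble."""
    return np.asarray(beta_532, dtype=float) / np.asarray(beta_1064, dtype=float)
```

With this convention, a scattering ratio of exactly 1 corresponds to a purely molecular return, which is why the PSC screening thresholds used later sit above 1.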

Filters were applied to the SOLVE LIDAR data to screen the PSC data for use in the cluster
analysis. Only data acquired during nighttime lighting conditions were used to ensure the
greatest signal-to-noise ratio and thus the lowest chance of misidentifying a noise excursion as a
PSC. The altitude range was restricted to 14 to 26 km, the lower limit being the lowest range of
the LIDAR data and the upper limit set at the maximum altitude at which PSCs were observed
during the SOLVE mission. The scattering ratios were restricted to R532 > 1.12 and R1064 > 1.6
to isolate PSC data for use in the analysis, and only temperatures below 198 K were included.
The final number of data points satisfying these filters and used as objects in the cluster analysis
was 18,275.
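
The screening criteria above amount to a boolean mask over the measurements. The following is a minimal sketch, assuming the data are available as parallel arrays; the argument names are hypothetical, and treating the 14 and 26 km altitude limits as inclusive is also an assumption.

```python
import numpy as np

def psc_mask(r532, r1064, altitude_km, temperature_k, is_night):
    """Return True where a measurement passes all of the screening filters."""
    r532 = np.asarray(r532, dtype=float)
    r1064 = np.asarray(r1064, dtype=float)
    altitude_km = np.asarray(altitude_km, dtype=float)
    temperature_k = np.asarray(temperature_k, dtype=float)
    is_night = np.asarray(is_night, dtype=bool)
    return (
        is_night                                        # nighttime data only
        & (altitude_km >= 14.0) & (altitude_km <= 26.0)  # altitude window (inclusive: assumption)
        & (r532 > 1.12) & (r1064 > 1.6)                  # scattering-ratio thresholds
        & (temperature_k < 198.0)                        # cold enough for PSCs
    )
```

Applying the mask to each variable array (for example `r532[mask]`) would yield the objects passed to the cluster analysis.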

During  the  SOLVE  mission,  there  were  occasions  when  strong  signals  from  PSCs  saturated  the 
detection  system  (i.e.,  the  signal  level  exceeded  the  range  of  an  amplifier  or  analog-to-digital 
converter),  making  the  measurements  inaccurate.  Unfortunately,  it  is  not  possible  in  post-flight 
analysis to determine which data were corrupted by saturation, and this may have affected some
of  the  results  presented  in  this  report. 


3.  Method 


3.1  Principal  Component  Analysis 

Principal component analysis (PCA) is a statistical method that was used here to resolve the
complicated variance of the multivariate data set consisting of β532, β1064, R532, R1064, δa,
β532/β1064, and temperature (T). Temperature profiles were derived from the Global Modeling and
Assimilation Office (GMAO) meteorological model. The analysis was done with the standardized
values of the variables, which gives them equal weight by creating new variables that each have a
mean of zero and a variance of one. The PCA essentially redefines the data set in terms of
derived variables that are based on linear combinations of the original variables (Preisendorfer,
1988). Each data point in the derived variable data set corresponds to exactly one data point in
the original data set.

The  weight  assigned  to  each  derived  variable  is  also  a  part  of  the  output  of  PCA.  Thus,  the 
derived  variables  that  account  for  large  portions  of  the  variance  of  the  data  can  be  identified, 
thereby  reducing  the  dimensions  of  the  analysis  by  disregarding  the  derived  variables  that 
account  for  a  minimal  variance.  Consequently,  information  concerning  the  variance  of  each 
original  variable  may  also  be  obtained.  A  more  detailed  description  of  the  method  is  provided  by 
Felton  (2003). 
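
The standardization and dimension-reduction steps can be illustrated with a short NumPy sketch. It computes the PCA through a singular value decomposition of the standardized data, which is one common approach; the report does not state which software or decomposition was used, so the function name, the SVD route, and the default threshold (taken from the 95% figure quoted in the Results section) are assumptions for illustration.

```python
import numpy as np

def standardized_pca(X, variance_to_keep=0.95):
    """PCA on standardized columns of X (rows = measurements, columns = variables).

    Returns the retained derived variables (scores), the fraction of variance
    attributed to every derived variable, and the loading matrix.
    """
    X = np.asarray(X, dtype=float)
    # Standardize: each variable gets mean 0 and variance 1.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Rows of Vt are the linear combinations defining the derived variables.
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)
    # Smallest number of derived variables whose cumulative variance
    # reaches the requested fraction.
    n_keep = int(np.searchsorted(np.cumsum(explained), variance_to_keep)) + 1
    n_keep = min(n_keep, explained.size)
    scores = Z @ Vt.T
    return scores[:, :n_keep], explained, Vt
```

Each row of the returned scores corresponds to exactly one row of the input, matching the one-to-one correspondence between original and derived data points described above.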

3.2  Cluster  Analysis 

A clustering algorithm was used to classify the clouds into groups, based on combinations of the
derived variables from the PCA. A particular partitioning method was chosen to perform
the cluster analysis. The technique is based on the search for representative objects, called
“medoids,” among the many objects of the data set (Struyf et al., 1996). The medoids are calculated
so that the total dissimilarity of all objects to their nearest medoid is minimal. A range of values
for the number of clusters (k) desired is required as input for the clustering algorithm. The
natural number of clusters can be obtained from analysis of a quality index calculated for each
cluster as well as the corresponding graphical output (Kaufman & Rousseeuw, 1990). This
index, named the “silhouette coefficient” (SC) by the authors, indicates how closely each object
is associated with its own cluster relative to the nearest other cluster. The authors’ suggested
interpretation of the values of the silhouette coefficient is that 0.71 to 1.00 imply that the
clusters are well defined; 0.51 to 0.70 imply that the clusters are reasonably defined; 0.26 to 0.50
imply that the clusters are poorly defined; and ≤ 0.25 imply that no substantial groupings have
been found.
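
The interpretation bands above can be written as a small lookup, shown here as a sketch. The function name is hypothetical, and assigning values that fall between the quoted bands (for example 0.705) to the lower band is an assumption made only so that every value receives a label.

```python
def interpret_silhouette(sc):
    """Map a silhouette coefficient to the qualitative bands quoted in the text."""
    if sc > 0.70:
        return "well defined"
    if sc > 0.50:
        return "reasonably defined"
    if sc > 0.25:
        return "poorly defined"
    return "no substantial grouping"
```

As the text goes on to caution, such a label should be read together with the cluster plots rather than on its own.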

Note  that  it  is  possible  to  obtain  a  relatively  high  silhouette  coefficient  for  a  value  of  k  that  is  not 
the  most  natural  number  of  clusters  for  the  data  set. 

For  this  reason,  Kaufman  and  Rousseeuw  (1990)  suggested  that  the  value  of  k  that  yields  the 
highest  silhouette  coefficient  should  not  be  selected  as  the  natural  number  of  clusters  for  the  data 
set  unless  the  graphical  output  is  also  examined.  The  groups  should  be  such  that  the  degree  of 
association  is  strong  between  members  of  the  same  cluster  and  weak  between  members  of 
different clusters. An example of this partitioning method is shown in figure 1.
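
To make the idea of medoids and silhouettes concrete, the sketch below implements a simplified k-medoids partition and the mean silhouette width. It is not the algorithm of the cited references: it uses a deterministic farthest-point start and an alternating assign-and-update loop instead of their build-and-swap procedure, and it stores a full dissimilarity matrix, so it suits only small data sets. All function names are illustrative.

```python
import numpy as np

def pairwise_distances(X):
    """Euclidean dissimilarity matrix (n x n); intended for small n only."""
    X = np.asarray(X, dtype=float)
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def initial_medoids(D, k):
    """Deterministic start: most central object, then farthest-point picks."""
    medoids = [int(np.argmin(D.sum(axis=1)))]
    while len(medoids) < k:
        nearest = D[:, medoids].min(axis=1)
        medoids.append(int(np.argmax(nearest)))
    return np.array(medoids)

def k_medoids(D, k, max_iter=100):
    """Alternate between assigning objects to the nearest medoid and
    re-choosing each medoid as the member with the least total dissimilarity."""
    medoids = initial_medoids(D, k)
    for _ in range(max_iter):
        labels = np.argmin(D[:, medoids], axis=1)
        updated = medoids.copy()
        for j in range(k):
            members = np.flatnonzero(labels == j)
            if members.size == 0:
                continue  # keep the previous medoid if a cluster empties
            within = D[np.ix_(members, members)].sum(axis=1)
            updated[j] = members[np.argmin(within)]
        if np.array_equal(updated, medoids):
            break
        medoids = updated
    labels = np.argmin(D[:, medoids], axis=1)
    return medoids, labels

def mean_silhouette(D, labels):
    """Average of s(i) = (b - a) / max(a, b) over all objects (requires k >= 2)."""
    clusters = np.unique(labels)
    s = np.zeros(D.shape[0])
    for i in range(D.shape[0]):
        own = labels == labels[i]
        n_own = int(own.sum())
        if n_own == 1:
            continue  # s(i) is taken as 0 for a single-member cluster
        a = D[i, own].sum() / (n_own - 1)  # mean dissimilarity within own cluster
        b = min(D[i, labels == c].mean() for c in clusters if c != labels[i])
        s[i] = (b - a) / max(a, b)
    return float(s.mean())
```

Running `k_medoids` for each k in the requested range and comparing the resulting `mean_silhouette` values mirrors the selection procedure described above, with the plots still needed as a cross-check.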

Figure 1.  An example of the partitioning method where the Vf are data points used in the
analysis. (A and B denote the medoids of the clusters. Lj are distances from
medoids to members of the cluster. The algorithm minimizes this distance.
This example is a 2-D representation of the clustering space.)

3.3  Simulation 

To test the clustering algorithm, random numbers were generated that are normally distributed
about the means for the R532 and aerosol depolarization ratios of the PSC types defined by
Browell et al. (1990). Thus, the test data were modeled after types Ia, Ib, and II PSCs. There
were an equal number of points for each PSC type, and nonphysical points were not included in
the simulation because such points were excluded from the data set altogether. The algorithm was
asked to partition the data into two to five clusters (k = 2, ..., 5). The means and standard
deviations used to generate the data are given in table 2. The resulting silhouette coefficients for
the partitioned test data are reported in table 3. As suggested by Kaufman and Rousseeuw
(1990), the silhouette coefficients and the actual graphical results of the clustered data should be
used to assess the validity of the classification.

Table 2.  Mean (μ) and standard deviations (σ) for
R532 and δa used to generate the test data.

Type   μ(R532)   σ(R532)   μ(δa)   σ(δa)
Ia     1.32      0.20      0.400   0.100
Ib     5.00      1.00      0.015   0.010
II     15.00     3.50      0.200   0.035
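
The test data described above could be generated along the following lines. This is a sketch under stated assumptions: the parameter values are those read from table 2, "nonphysical" is taken to mean R532 < 1 or δa < 0, and equal counts per type are obtained by redrawing until enough physical points are available. The report does not specify these details, and the sample size and seed are arbitrary.

```python
import numpy as np

# (mean R532, std R532, mean delta_a, std delta_a), as read from table 2
PSC_PARAMS = {
    "Ia": (1.32, 0.20, 0.400, 0.100),
    "Ib": (5.00, 1.00, 0.015, 0.010),
    "II": (15.00, 3.50, 0.200, 0.035),
}

def simulate_psc_test_data(n_per_type=500, seed=0):
    """Draw normally distributed (R532, delta_a) pairs for each PSC type."""
    rng = np.random.default_rng(seed)
    points, types = [], []
    for name, (mu_r, sd_r, mu_d, sd_d) in PSC_PARAMS.items():
        kept = np.empty((0, 2))
        while kept.shape[0] < n_per_type:
            draw = np.column_stack([
                rng.normal(mu_r, sd_r, n_per_type),
                rng.normal(mu_d, sd_d, n_per_type),
            ])
            # Assumed definition of a physical point.
            physical = (draw[:, 0] >= 1.0) & (draw[:, 1] >= 0.0)
            kept = np.vstack([kept, draw[physical]])
        points.append(kept[:n_per_type])
        types += [name] * n_per_type
    return np.vstack(points), np.array(types)
```

Because the two variables differ greatly in scale, standardizing them before clustering (as is done for the measured data) keeps R532 from dominating the dissimilarities.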

Table 3.  Mean values, μ, and standard deviations, σ, for R532
and δa for the three clusters resulting from the k = 3
partition of the test data.

Cluster   μ(R532)   σ(R532)   μ(δa)   σ(δa)
black     1.414     0.426     0.403   0.097
red       4.975     1.177     0.022   0.028
yellow    15.035    3.257     0.199   0.034

The cluster plots for k = 2, ..., 5 are shown in figures 2a, b, c, and d. In the case of k = 2, one
cluster is formed as the union of the type Ib and type II PSCs. Although the silhouette
coefficient of 0.57 suggests that these may be reasonably defined clusters, we know that type Ib
and type II PSCs are distinct from each other because of their depolarization values. When k = 3,
all three PSC types are accurately identified. Despite outlying points having a negative effect
on the corresponding silhouette coefficient, a value of 0.73 suggests that strong clusters have
been found. For k = 4, the type Ia and Ib PSCs are accurately identified but the type II PSCs are
partitioned into two clusters. The lack of dissimilarity between the two clusters resulting from
the partitioning of the type II PSCs results in a lower silhouette coefficient than the 0.73 value
for k = 3. When k = 5, only the type Ib PSCs are accurately identified as a cluster. The type II
and type Ia PSCs are partitioned into two clusters each. The corresponding silhouette coefficient
decreases to 0.62 because of the lack of separation between the clusters resulting from the
partitioning of the type Ia and type II PSCs.



Figure 2.  Test data partitioned into k = 2, ..., 5 clusters. (a. k = 2, where the yellow cluster is
the union of type Ib and type II and the black cluster is type Ia. b. k = 3, where
the black cluster is type Ia, the red is type Ib, and the yellow is type II. c. k = 4,
where the black cluster is type Ia, the red is type Ib, and the blue and yellow
represent the type II split into two separate clusters. d. k = 5, where the red is type
Ib, the black and green represent type Ia split into two separate clusters, and the
blue and yellow represent type II split into two separate clusters.)

Thus, the silhouette coefficient of 0.73 suggests that three is the most natural number of clusters
for the test data set. When we look at the corresponding cluster plot, we see that all three types
of PSCs included in the test are accurately identified as separate clusters. Also, from examining
the cluster plots of the other values of k, we see that the silhouette coefficient is a function of the
homogeneity of the clusters as well as their relative dissimilarities to each other. Table 3 shows
the mean values and standard deviations of the clusters resulting from the k = 3 partition. We
can also see that the algorithm has successfully identified the type Ia, Ib, and II PSCs (black, red,
and yellow clusters, respectively) by comparing the resulting values in table 3 to the values used
to generate the test data in table 2. However, although the yellow and black clusters are very
homogeneous, the red cluster does contain some data points that do not have Ib characteristics.
For instance, type Ia particles with low depolarization are included in the red cluster. In
addition, particles with a mixture of type Ia and Ib properties (R532 ≈ 6 and δa ≈ 0.15) are
included in the red cluster.


4.  Results


Different combinations of R532, R1064, δa, β532, β1064, β532/β1064, and T were used as input for the
analysis to see which combination of variables is best for identifying the different types of PSCs.
The variable combinations that included at least three of the four scattering variables (R532, R1064,
β532, and β1064) resulted in the highest separation between clusters. These variable combinations
are shown in table 4. The results of performing the analysis on variable combinations with just
scattering ratio or backscatter coefficient are presented in Felton (2003). These variable
combinations resulted in clusters that have poor separation with respect to each other. In all
cases, PCA was used to reduce the dimensions of the analysis while still capturing at least 95%
of the total variance of the data set. The results are presented in three sections. The first section
consists of the results of the analysis performed on data from all 11 SOLVE flights for which
acceptable data were available. Analyzing large amounts of data in this manner shows the
method’s usefulness in the processing of satellite data in real time for a first-order classification
of PSC types. The second and third sections focus on individual days that have a significant
difference in their lowest temperatures.

Table 4.  The different variable combinations that have been used in the
analyses.

Combination   Variables
A             R532, R1064, δa, β532, β1064, β532/β1064, T
B             R532, R1064, δa, β532, β1064, β532/β1064
C             R532, δa, β532, β1064, β532/β1064, T

4.1  Full Data Set

The percent of the variance attributed to each derived variable used in the analysis for variable
combinations A, B, and C is shown in table 5. For each combination, the main contributors to
derived variables 1, 2, and 3 are the scattering variables, aerosol depolarization and color ratio,
and temperature, respectively. Therefore, for A, the scattering variables account for
approximately 56% of the variance while aerosol depolarization and color ratio account for
approximately 24% of the variance. When the PCA was performed on B, the variance accounted
for by the scattering variables increased by roughly 7% and the variance accounted for by
aerosol depolarization and color ratio increased by less than 4%. For C, the variance accounted
for by the scattering variables decreased to 50% of the total variance and that of aerosol
depolarization, color ratio, and temperature increased slightly.

Table 5.  The percentage variance of the data set attributed to each derived variable
used in the analyses for variable combination A, B, and C.

Derived    Percent variance   Percent variance   Percent variance
Variable   for A              for B              for C
1          56                 63                 50
2          24                 28                 28
3          12                 5.9                14
4          5.0                N/A                5.8
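
As a quick arithmetic check, the percentages in table 5 can be accumulated to confirm that the retained derived variables capture at least 95% of the variance for each combination. The values below are those read from the table; the dictionary and function names are illustrative.

```python
# Percent variance per derived variable, as read from table 5.
PERCENT_VARIANCE = {
    "A": [56, 24, 12, 5.0],
    "B": [63, 28, 5.9],
    "C": [50, 28, 14, 5.8],
}

def cumulative_percent(values):
    """Running total of the percent variance over the derived variables."""
    total, running = 0.0, []
    for value in values:
        total += value
        running.append(total)
    return running

for name, values in PERCENT_VARIANCE.items():
    print(name, [round(c, 1) for c in cumulative_percent(values)])
```

The totals come to roughly 97% for A, 96.9% for B, and 97.8% for C, so three derived variables suffice for B while A and C need four to pass the 95% mark.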

Comparisons of the analysis performed on variable combinations A, B, and C are shown in
figures 3 and 4. The presence of liquid or solid particles within the ensembles can be inferred
from the R532 versus δa plots of figure 3, and the presence of large or small particles within the
ensembles can be inferred from the R532 versus β532/β1064 plots of figure 4. For each variable
combination, the data set was partitioned into k = 2, 3, ..., 6 clusters and the partitions
corresponding to the three highest silhouette coefficients are presented.


[Figure 3 panels: aerosol depolarization versus 532-nm scattering ratio for variable combinations
A, B, and C; each panel title gives k and the SC. Legible titles include k = 2, SC = 0.49;
k = 2, SC = 0.52; k = 3, SC = 0.45; k = 3, SC = 0.39; k = 4, SC = 0.37; and k = 4, SC = 0.47.]

Figure 3.  Comparisons of R532 versus δa for the analysis performed on variable combinations A, B, and C.

Figure 4.  Comparisons of R532 versus β532/β1064 for the analysis performed on variable combinations A, B,
and C. The clusters are represented by color.


For A, in figures 3 and 4, the SC of 0.49 suggests that k = 2 is the best partition for the data set.
Since all of the scattering variables were used, particle scattering dominates the partitioning and
separates low and high scattering measurements, indicated by the black and red clusters,
respectively. Both clusters contain measurements with depolarizing and non-depolarizing
particles, suggesting that they are not entirely homogeneous. When k = 3, the low scattering
measurements are separated according to depolarization, indicated by red and black clusters, and a
yellow cluster continues to consist of non-depolarizing and depolarizing measurements. When
k = 4, the yellow cluster becomes entirely non-depolarizing measurements and a purple cluster
emerges that is poorly defined, consisting of outlying measurements with very high R532 that are
non-depolarizing with high color ratios or depolarizing with smaller color ratios.

The results for B in figures 3 and 4 are almost identical to those of A. Eliminating T from the
analysis resulted in increased SCs, which suggests that many of the clusters may occur at similar
temperatures. This combination of variables also reduces the number of measurements with
small non-depolarizing particles in the purple cluster for k = 4.

For C in figures 3 and 4, eliminating R1064 adds more weight to δa. When k = 2, the data are now
partitioned with respect to δa in addition to R532. The clusters that result from the k = 3 partition
remain the same as the three clusters in A and B. The highest SC occurs when k = 4 and the
clusters are now very homogeneous except for the purple cluster, which still contains the
outlying measurements containing large particles with high R532.

Variable combination C results in clusters that best represent the known PSC types. For variable
combinations A and B (which include R1064 in the analysis) in figures 3 and 4, the k = 4 partition
results in a purple cluster that contains measurements of ensembles with large particles centered
at δa = 0.3 as well as a few small non-depolarizing particles. Although many of these depolarizing
measurements have lower R532 values than the non-depolarizing measurements, both have similar
R1064 values because the depolarizing measurements contain large particles, which scatter more
efficiently at 1064 nm. Therefore, including R1064 impedes the accurate classification of the
non-depolarizing measurements with high scattering ratio values. For this reason, variable
combination C will be used in the analyses of the individual days in the following sections.

The black cluster that results from the use of variable combination C to partition the data set into
four clusters (figures 3 and 4) consists of measurements of ensembles with small particles and
has mean values of R532 = 1.252 and δa = 0.041. These particles are centered around the origin
and are most likely the aerosols that are precursors to PSCs. The red cluster resembles type Ia
measurements with mean values of R532 = 1.235 and δa = 0.231, and the yellow cluster resembles
type Ib measurements, having mean values of R532 = 2.503 and δa = 0.024. Most of the purple
cluster, centered at R532 = 2.5 and δa = 0.3, consists of measurements that resemble type
Ia-enhanced. Included in this cluster are measurements that have very high R532, as do the type II
PSCs.

Not shown in figures 3 and 4 are the k > 4 partitions where the algorithm partitions other clusters
further before successfully identifying the measurements indicative of the type II particles. Two
factors prevent the algorithm from successfully identifying these measurements. First, type II
measurements are far less numerous than other types of PSC measurements and tend to look like
outliers of the Ia-enhanced cluster. The filtered data set has a total of 18,275 data points while
the number of data points with R532 > 5 is only 176. Second, the type II PSC measurements share
optical characteristics such as high δa and low β532/β1064 with the type Ia-enhanced.


4.2  CA For January 23, 2000


The cluster algorithm partitioned the 6,089 measurements (objects) from January 23, 2000, into
k = 2, ..., 6 clusters. A flight map is provided in figure 5. Some of the measurements on this day
occurred during periods when the temperature was less than 188 K.


Figure 5.  January 23, 2000 flight map.

Variable combination C was used in the analysis, and the resulting SCs are shown in table 6.
Figure 6 shows R532 versus δa and figure 7 shows R532 versus β532/β1064 for the three values of k
with the highest SCs, where the blue stars are the representative objects (medoids) for each
cluster. The two clusters resulting from the k = 2 partition are not representative of the number
of PSC types present because they are composed of measurements of ensembles with small and large
particles as well as liquid and solid particles. When k = 4, the SC reaches its maximum value of
0.40. A black cluster with low R532 and low to moderate δa and a yellow cluster with low to moderate
R532 and low δa are found. Other resulting clusters include a red cluster with low to moderate
R532, moderate to high δa, and moderate to high β532/β1064 and a purple cluster with high R532 and
δa, and low to moderate β532/β1064. There is good separation between the medoids of these
clusters except for the yellow and black clusters. Partitioning the data set into five clusters
results in a decline in SC to 0.35 because of the lack of separation between the red and green
clusters.

Table 6.  Silhouette coefficients for 2 to 6 clusters of the January 23 data set.

No. of clusters   SC
2                 0.38
3                 0.33
4

Figure 6.  R532 versus δa for k = 2, 4, and 5 for the January 23 data set. (The blue stars are the representative
objects of the clusters.)

Figure 7.  R532 versus β532/β1064 for k = 2, 4, and 5 for the January 23 data set. (The blue stars are the representative
objects of the clusters.)


Therefore, the optimum number of clusters for January 23, 2000, is four. The mean values of
these four clusters are shown in table 7. Despite having low R532, the black cluster’s low δa
suggests that not all of the particles in these ensembles are the type Ia described by Browell et al.
(1990). From figure 6, it can be seen that there are many measurements in the black cluster with
δa < 5%, which suggests the presence of more liquid particles than solid type Ia particles. Thus,
the black cluster seems to be composed of a mixture of type Ia and precursor aerosol particles.
The red cluster matches very well with the type Ia-enh described by Tsias et al. (1999) and
Reichardt et al. (2000), which has both moderate R532 and δa. The measurements of the yellow
cluster that contain small spherical particles are predominantly the type Ib described by Browell
et al. (1990).


Table 7.  Mean values of R532, δa, β532/β1064, and T for k = 4 clusters for January 23 data set.

Cluster   Mean R532   Mean δa   Mean β532/β1064   Mean T (K)
Black     1.386       0.064     2.962             192
Red       2.365       0.284     1.985             191
Yellow    2.791       0.024     3.492             188
Purple    12.563      0.190     1.844             184

The purple cluster is the type II PSCs described by Browell et al. (1990) with its high R532 and
moderate δa. This PSC type was not identified as a separate cluster when the entire data set was
used  as  input  to  the  analysis.  Using  the  smaller  number  of  data  points  in  the  January  23  data  set 
made  it  possible  to  cluster  this  class  separately,  probably  because  type  II  PSCs  represent  a  larger 
fraction  of  the  observations  from  the  flight  of  23  January  than  they  do  in  the  composite  of 
observations  from  all  the  flights. 

Figure 8 is an image plot for January 23, 2000, which illustrates the spatial proximity of the
different types of PSCs in the CA. The first three panels in the figure show R532, δa, and
β532/β1064, and the last panel shows the classification of the PSC measurements resulting from the
CA. The type Ia and precursor mixture, type Ia-enh, type Ib, and type II correspond to black,
red, yellow, and purple, respectively. On this particular day, the aerosols were found along the
outer edges of the type Ia-enhanced, Ib, and II PSCs, and the type Ib and type II PSCs were
found adjacent to each other in the coldest regions of the cloud.

Figure 8. Image plots for January 23, 2000. (Panels a, b, and c show R532, δa, and
β532/β1064, respectively, and panel d shows the classification of the LIDAR
data into clusters. The colors are as follows: black, PSC type Ia and
precursor aerosol mixture; red, PSC type Ia-enh; yellow, PSC type Ib; and
purple, PSC type II. The gap in d represents data filtered out of the analysis
because of excessive noise from high solar background light.)


4.3  CA  For  March  5,  2000 

The cluster algorithm was used to partition the 3,844 measurements from March 5, 2000, into
k = 2, 3, and 4 clusters. The flight map for this day is shown in figure 9. The temperatures
corresponding to the measurements on this day were above 188 K. The resulting silhouette
coefficients and the graphical output are shown in figures 10 and 11.

Figure 9. March 5, 2000 flight map.

Figure 10. R532 versus δa for k = 2, 3, and 4 for March 5 data set. (The blue stars are the representative objects of
the clusters.)

Figure 11. R532 versus β532/β1064 for k = 2, 3, and 4 for March 5 data set. (The blue stars are the representative
objects of the clusters.)
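
The "representative objects" marked in the cluster plots suggest a medoid-based partitioning, in
which each cluster is summarized by one actual measurement rather than by a computed centroid.
The Python sketch below illustrates that idea under stated assumptions: the function names
(euclid, assign, k_medoids), the simple alternating update, the random initialization, and the
toy (R532, δa) pairs are all hypothetical and are not taken from this report, whose algorithm,
distance measure, and variable scaling may differ.

```python
import math
import random


def euclid(p, q):
    """Euclidean distance between two equal-length feature tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def assign(points, medoids):
    """Label each point with the index of its nearest medoid."""
    return [min(range(len(medoids)), key=lambda j: euclid(p, medoids[j]))
            for p in points]


def k_medoids(points, k, max_iter=100, seed=0):
    """Alternating k-medoids sketch (an assumption, not the report's code)."""
    rng = random.Random(seed)
    medoids = rng.sample(points, k)  # hypothetical random initialization
    for _ in range(max_iter):
        labels = assign(points, medoids)
        new_medoids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if not members:
                # Keep the old medoid if duplicates left this cluster empty.
                new_medoids.append(medoids[j])
                continue
            # The representative object is the member with the smallest
            # total distance to all members of its own cluster.
            best = min(members,
                       key=lambda c: sum(euclid(c, m) for m in members))
            new_medoids.append(best)
        if new_medoids == medoids:
            break  # converged: no medoid changed
        medoids = new_medoids
    return medoids, assign(points, medoids)


# Made-up (R532, delta_a) pairs for illustration only.
toy = [(1.19, 0.03), (1.20, 0.04), (1.21, 0.13),
       (1.22, 0.12), (1.88, 0.02), (1.90, 0.03)]
medoids, labels = k_medoids(toy, k=3)
print(medoids, labels)
```

With real data the input variables would normally be standardized first so that the variable
with the largest numerical range does not dominate the distance; that step is omitted here.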


Although the two-cluster partition (k = 2) yields the highest SC, the range of depolarization
ratios suggests that the black cluster must consist of measurements of ensembles with both solid
and liquid particles. When k = 3, the SC drops to 0.45 because, while the yellow cluster remains
the same, the black cluster from k = 2 is partitioned into two clusters with similar R532,
represented by the orange and black clusters. Despite the lower SC, the k = 3 partition describes
the particles present more accurately because the measurements with liquid particles (black
cluster) and the measurements with solid particles (orange cluster) are now classified
separately. When k = 4, the SC decreases to 0.31 because of an extraneous partition that produces
the green cluster, whose representative object is very similar to that of the black cluster.
Three is therefore the most appropriate number of classes for the PSCs observed on March 5,
2000. The mean values of these three clusters are shown in table 8.
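
The comparison of partitions above relies on the silhouette coefficient. As a minimal sketch,
and assuming the SC is the mean silhouette width s(i) = (b - a) / max(a, b) over all
measurements (the usual definition; the report's exact variant is not restated here), it could
be computed as follows. The function names and the four toy points are hypothetical.

```python
import math


def euclid(p, q):
    """Euclidean distance between two equal-length feature tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def silhouette_coefficient(points, labels):
    """Mean silhouette width; requires at least two distinct labels."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    widths = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            widths.append(0.0)  # common convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(euclid(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(sum(euclid(p, q) for q in other) / len(other)
                for key, other in clusters.items() if key != lab)
        denom = max(a, b)
        widths.append((b - a) / denom if denom > 0 else 0.0)
    return sum(widths) / len(widths)


# Two well-separated toy groups: a sensible labeling scores near 1,
# a labeling that mixes the groups scores below 0.
pts = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
print(silhouette_coefficient(pts, [0, 0, 1, 1]))
print(silhouette_coefficient(pts, [0, 1, 0, 1]))
```

A high value only says that the clusters are compact and well separated in the chosen
variables; as the March 5 case shows, the partition with the highest SC is not necessarily the
one that best separates physically distinct particle types.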

Table 8. Mean values of R532, δa, and β532/β1064 for k = 3 clusters of March 5 data set.

Cluster   Mean R532   Mean δa   Mean β532/β1064   Mean T (K)
Black     1.189       0.034     2.841             194
Orange    1.211       0.129     1.847             194
Yellow    1.879       0.023     2.914             193

The black cluster has optical characteristics very similar to the black cluster in the data set for
January 23, 2000. This cluster is likely a mixture of solid and liquid aerosol particles. The mean
values for the orange cluster are almost identical to the type Ia found by Browell et al. (1990)
with R532 < 1.5 and δa > 0.1. The presence of the small spherical particles of the yellow cluster is
consistent with type Ib particles found by Browell et al. (1990).
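
The type Ia criterion quoted above (R532 < 1.5 and δa > 0.1) can be checked directly against
the cluster means in table 8. The short Python sketch below does only that; the function name
matches_type_ia is hypothetical, and the thresholds are the two stated in the text, not a
complete classification scheme.

```python
def matches_type_ia(r532, delta_a, r_max=1.5, delta_min=0.1):
    """True when a (R532, delta_a) pair meets the type Ia criterion in the text."""
    return r532 < r_max and delta_a > delta_min


# Cluster means (R532, delta_a) copied from table 8.
table8_means = {
    "black": (1.189, 0.034),
    "orange": (1.211, 0.129),
    "yellow": (1.879, 0.023),
}

for name, (r532, delta_a) in table8_means.items():
    print(name, matches_type_ia(r532, delta_a))
# prints: black False, orange True, yellow False (one pair per line)
```

Only the orange cluster satisfies both conditions: the black cluster fails on depolarization and
the yellow cluster fails on both scattering ratio and depolarization.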

Image plots for March 5, 2000, are shown in figure 12. The first three panels in the figure show
R532, δa, and β532/β1064, and the last panel shows the classification of the PSC particles resulting
from the CA. The type Ia and precursor mixture, type Ia, and type Ib correspond to black,
orange, and yellow, respectively. The aerosols are found on the outer edges of the other PSC
types. This cloud displays a layer of type Ib PSCs directly above a layer of type Ia PSCs.

Figure 12. Image plots for March 5, 2000. (Panels a, b, and c show R532, δa, and β532/β1064, respectively, and
panel d shows the classification of the LIDAR data into clusters. The colors are as follows: black, PSC
type Ia and precursor aerosol mixture; yellow, PSC type Ib; and orange, PSC type Ia.)

4.4  Clusters  in  Analyses 

A total of five particle types has been identified in the cluster analyses of the SOLVE LIDAR
data. Four of the clusters are known PSC types, and the remaining cluster resembles a mixture of
liquid and solid background aerosol particles that are precursors to the PSC particles. This
precursor cluster, identified as the black cluster in this study, was often found on the outside
edges of clouds, as in figures 8 and 12, which is similar to the findings of Biele et al. (2001).
All five particle types were observed on January 23 and March 5 taken together. On the other
nine days, different combinations of some of these five clusters, summarized in table 9, were
observed.

Table 9. Classification and mean characteristics of the clusters found on January 23 and March 5.

Cluster   Classification                 Mean R532   Mean δa   Mean β532/β1064   Mean T (K)
Black     Liquid/solid aerosol mixture    1.288      0.049     2.902             193.001
Orange    Ia                              1.211      0.129     1.847             194.097
Red       Ia-enhanced                     2.365      0.284     1.985             190.929
Yellow    Ib                              2.335      0.024     3.203             190.795
Purple    II                             12.563      0.190     1.844             184.159
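
For the two clusters present on both days (black and yellow), the optical values in table 9
appear to agree with a simple unweighted average of the per-day means in tables 7 and 8. This is
an inference from the tabulated numbers, not something the text states, and the Python check
below is offered only as a sketch of that arithmetic. The helper name unweighted_mean and the
tolerance are assumptions; temperature is left out because tables 7 and 8 list it only to the
nearest kelvin.

```python
# (R532, delta_a, beta532/beta1064) cluster means copied from tables 7, 8, and 9.
jan23 = {"black": (1.386, 0.064, 2.962), "yellow": (2.791, 0.024, 3.492)}
mar05 = {"black": (1.189, 0.034, 2.841), "yellow": (1.879, 0.023, 2.914)}
table9 = {"black": (1.288, 0.049, 2.902), "yellow": (2.335, 0.024, 3.203)}


def unweighted_mean(a, b):
    """Element-wise average of two equal-length tuples."""
    return tuple((x + y) / 2 for x, y in zip(a, b))


agreement = {}
for name in table9:
    avg = unweighted_mean(jan23[name], mar05[name])
    # Allow for rounding to three decimals in the published tables.
    agreement[name] = all(abs(x - y) < 1e-3 for x, y in zip(avg, table9[name]))

print(agreement)
```

If the averaging really is unweighted, the combined means do not account for the different
number of measurements contributed by each day, which is worth keeping in mind when comparing
table 9 with other data sets.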

The results of this study are consistent with the theories on the PSC particle growth continuum.
Both type Ia and Ib PSCs are believed to start from stratospheric background aerosols, i.e.,
aqueous sulfuric acid solutions. Tabazadeh et al. (1994) suggested that liquid sulfate aerosols
resulted in type Ib particle formation while frozen sulfate aerosols resulted in type Ia particles.
Type Ia-enh clouds are characterized by nitric acid hydrate particles close to thermodynamic
equilibrium (Tsias et al., 1999), and if the temperature is low enough, the type Ib clouds freeze to
form type II clouds (Federico et al., 2001). Analysis of the SOLVE data supports these claims.
The small amount of type II PSCs and many of the type Ib PSCs were observed on days with
extremely cold temperatures (T < 188 K). In the image plots for January 23 in figure 8, a region
of type II particles (purple) is found embedded inside a type Ib cloud (yellow). Figure 13 shows
temperature versus δa for January 23 and illustrates the occurrence of type Ib and II PSC particles
at the coldest temperatures.

Figure 13. Temperatures at which the four clusters found on January 23 were observed. (Plot title: SOLVE Data
Clusters.)


Clouds were present in 7.27% of all nighttime data. The percentage of each PSC type found in
the analyses is shown in table 10, and the frequency distributions for depolarization ratio and
color ratio with the 532 nm backscatter coefficient are shown in figures 14 and 15. The
concentrations of the aerosol liquid-solid mixture and the type II PSCs (48.62% and 0.66%,
respectively, of the PSC data) are fairly typical of data taken in the northern hemisphere
because the temperatures usually are not extremely cold long enough to allow for the formation
of many type II PSCs. Consistent with many other studies, type Ia was the most frequently
observed PSC type.

Table 10. Percentage of the total number of nighttime data points for each PSC type found
in the analysis.

PSC Type                       Number of Observations   Percentage
Aerosol liquid/solid mixture   8886                     48.62
Ia                             4714                     25.80
Ib                             2609                     14.28
Ia-enhanced                    1946                     10.65
II                              120                      0.66

Figure 14. Frequency distribution for δa versus log(β532). (Colors correspond to total
amount of observations in each bin.)

Figure 15. Frequency distribution for β532/β1064 versus log(β532). (Colors correspond
to total amount of observations in each bin.)
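
As a quick arithmetic check, the percentages in table 10 can be recomputed from the observation
counts. The Python sketch below assumes only the counts listed in the table; the dictionary keys
are illustrative labels. Recomputing gives 25.79 for type Ia where the table lists 25.80, a
difference in the last digit that is presumably due to rounding.

```python
# Observation counts copied from table 10.
counts = {
    "aerosol liquid/solid mixture": 8886,
    "Ia": 4714,
    "Ib": 2609,
    "Ia-enhanced": 1946,
    "II": 120,
}

total = sum(counts.values())  # total number of classified nighttime points
percent = {name: round(100 * n / total, 2) for name, n in counts.items()}

print(total)
for name, value in percent.items():
    print(f"{name}: {value:.2f}%")
```

The counts sum to 18,275, which matches the number of nighttime data points quoted in the
summary.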
5.  Summary


A classical statistical technique was used to objectively identify PSC types and their
corresponding characteristics with data acquired by the LaRC aerosol LIDAR during the SOLVE
mission. Combinations of backscatter LIDAR measurements from which particle abundance,
phase, and size can be inferred are used to classify the data and deduce PSC types. The variables
used in the analyses are R532, R1064, δa, β532, β1064, β532/β1064, and T. PCA showed that the bulk of
the variance in the data set is attributed to scattering. Particle phase and size account for the
second most weighted component of the analysis, followed by temperature.

Cluster  analysis  was  used  to  classify  the  clouds  into  mutually  unknown  groups,  based  on 
combinations  of  the  derived  variables  from  the  PCA. 

It was found that using R532, δa, β532, β1064, β532/β1064, and T as input to the analysis resulted in the
best clusters. When all 18,275 nighttime data points were used as input to the analysis, a
liquid-solid aerosol mixture and types Ia and Ib were accurately identified.

Another cluster consisting of both type Ia-enhanced and type II particles was also found.

The  small  relative  number  of  type  II  PSC  events  impeded  the  algorithm’s  ability  to  identify  type 
II  PSCs  as  a  separate  cluster. 

Cluster analysis was also performed on two individual days of the SOLVE mission. January 23,
2000, contained 85.8% of the type II particles. When cluster analysis was performed on this day,
the type II particles were recognized as a distinct cluster along with an aerosol cluster and types
Ia-enhanced and Ib. The CA of March 5, 2000, which did not exhibit temperatures cold enough
for the formation of type II (T < 188 K), resulted in an aerosol cluster and clusters for types Ia
and Ib.

The clustering algorithm also provided a measure of how well the clusters are defined, which can
be used to determine the most natural number of clusters for the data set. This measure, referred
to as the silhouette coefficient, proved to be insufficient for objectively identifying the number of
PSC types present. This inability may be attributed to the PSC particle growth continuum
discussed by Tabazadeh et al. (1994), Tsias et al. (1999), and Federico et al. (2001), which
precludes a clear distinction between the optical properties of many of the PSC types.
Uncertainty in the LIDAR and temperature measurements may also contribute to the low
silhouette coefficients.

The concentration of PSC particles in the data may affect the algorithm's ability to identify
separate clusters. This may have been the case with the type II particles in this analysis. When
the entire data set was clustered, there were only 176 data points with R532 > 5 of a total of
18,275. For all variable combinations, the algorithm failed to separate type II particles until at
least k = 6. For instance, for k = 5, the type II particles were grouped with the type Ia-enhanced
because they have similar δa and β532/β1064. When the data from January 23, which contained
85.8% of the type II particles, were clustered separately, type II particles were successfully
identified with the k = 4 partition. When we used only data with R1064 > 2, the number of data
points for all the days decreased to 15,179 and the type II PSCs were identified with the k = 5
partition.

It is likely that the classification of type II PSCs can be improved with Antarctic data sets, since
type II PSCs occur much more frequently there because of the colder temperatures. This may
reduce the effects of type II PSCs occurring in such limited numbers relative to the other types
of PSCs. This research may also benefit from the incorporation of additional variables that may
help the algorithm distinguish between the PSC types. Using aerosol depolarization at both the
visible and infrared wavelengths may enable the algorithm to identify the cloud types presented
by Toon et al. (2000). It is also worth noting that this method can be used for analyzing large
stratospheric LIDAR data sets, such as the imminent satellite-based measurements from the
CALIPSO space-borne LIDAR.